Abstract

Background: Several disease-specific questionnaires to measure pain and disability in patients with neck pain have 
been translated. However, a simple translation of the original version doesn't guarantee similar measurement 
properties. The objective of this study is to critically appraise the quality of the translation process, cross-cultural 
validation and the measurement properties of translated versions of neck-specific questionnaires. 

Methods: Bibliographic databases were searched for articles concerning the translation or evaluation of the 
measurement properties of a translated version of a neck-specific questionnaire. The methodological quality of the 
selected studies and the results of the measurement properties were critically appraised and rated using the 
COSMIN checklist and criteria for measurement properties. 

Results: The search strategy resulted in a total of 3641 unique hits, of which 27 articles, evaluating 6 different 
questionnaires in 15 different languages, were included in this study. Generally the methodological quality of the 
translation process is poor and none of the included studies performed a cross-cultural adaptation. A substantial 
amount of information regarding the measurement properties of translated versions of the different neck-specific 
questionnaires is lacking. Moreover, the evidence for the quality of measurement properties of the translated 
versions is mostly limited or assessed in studies of poor methodological quality. 

Conclusions: Until results from high-quality studies are available, we advise using the Catalan, Dutch, English,
Iranian, Korean, Spanish and Turkish versions of the NDI, the Chinese version of the NPQ, and the Finnish, German
and Italian versions of the NPDS. The Greek NDI needs cross-cultural validation and there is no methodologically
sound information for the Swedish NDI. For all other languages we advise translating the original version of the
NDI.



Background 

Several disease-specific questionnaires have been devel- 
oped to measure pain and disability in patients with 
neck pain (e.g. Neck Disability Index (NDI), Neck Pain 
and Disability Scale (NPDS)) [1,2]. To make them suita- 
ble for use in other languages, several of these neck-spe- 
cific questionnaires have been translated. However, a 
simple translation of the original version doesn't guaran- 
tee similar measurement properties, because differences 
in cultural context have to be taken into account as well 
[3,4]. 

Previous reviews of neck-specific questionnaires have
not paid sufficient attention to possible differences in
performance caused by differences in cultural context,
and have combined the results of studies that evaluated
measurement properties of different language versions of the
same questionnaire [5,6]. This may lead to inconsistent
results for measurement properties, as was demonstrated
in a recent review of the cross-cultural adaptations
of the McGill Pain Questionnaire [7].

Since it is possible that the measurement properties of 
neck-specific questionnaires vary between different 
nationalities, we decided to evaluate them per language. 
This reduces inconsistency in results due to cultural dif- 
ferences and also facilitates a choice for the best ques- 
tionnaire per language. The measurement properties of 
original versions of the different neck-specific 
questionnaires were evaluated in a separate systematic
review. (Schellingerhout JM, Heymans MW, Verhagen 
AP, De Vet HC, Koes BW, Terwee CB: Measurement 
properties of disease-specific questionnaires in patients 
with neck pain: a systematic review, submitted) 

The purpose of this study is to critically appraise the 
quality of the translation process, cross-cultural valida- 
tion and the measurement properties of translated ver- 
sions of neck-specific questionnaires. 

Methods 

Search strategy 

We searched the following computerised bibliographic 
databases: Medline (1966 to July 2010), EMBASE (1974
to July 2010), CINAHL (1981 to July 2010), and PsycINFO
(1806 to July 2010). We used the index terms
"neck", "neck pain", and "neck injuries/injury" in com- 
bination with "research measurement", "question- 
naire", "outcome assessment", "psychometry", 
"reliability", "validity", and derivatives of these terms. 
The full search strategy used in each database is avail- 
able upon request from the corresponding author. 
Reference lists were screened to identify additional 
relevant studies. 

Selection criteria 

A study was included if it was a full text original article 
(e.g. not an abstract, review or editorial), published in 
English, concerning the translation or evaluation of the 
measurement properties of a translated version of a 
neck-specific questionnaire. The questionnaire had to be 
self-reported, evaluating pain and/or disability, and spe- 
cifically developed or adapted for patients with neck 
pain. 

For inclusion, neck pain had to be the main complaint 
of the study population. Accompanying complaints (e.g. 
low back pain or shoulder pain) were no reason for 
exclusion, as long as the main focus was neck pain. Stu- 
dies considering study populations with a specific neck 
disorder (e.g. neurological disorder, rheumatological dis- 
order, malignancy, infection, or fracture) were excluded, 
except for patients with cervical radiculopathy or
whiplash-associated disorder (WAD).

Two reviewers (JMS, APV) independently assessed the 
titles, abstracts, and reference lists of studies retrieved 
by the literature search. In case of disagreement between 
the two reviewers, there was discussion to reach consen- 
sus. If necessary, a third reviewer (HCV) made the deci- 
sion regarding inclusion of the article. 

Measurement properties 

The measurement properties are divided over three 
domains: reliability, validity, and responsiveness [8]. In 
addition, the interpretability is described. 



Reliability 

Reliability is defined as the extent to which scores for 
patients who have not changed are the same for 
repeated measurement under several conditions: e.g. 
using different sets of items from the same question- 
naire (internal consistency); over time (test-retest); by 
different persons on the same occasion (inter-rater); or 
by the same persons on different occasions (intra-rater) 
[8]. 

Reliability contains the following measurement prop- 
erties: 

- Internal consistency: The interrelatedness among
the items in a questionnaire, expressed by Cronbach's
α or Kuder-Richardson Formula 20 (KR-20)
[8,9].

- Measurement error: The systematic and random 
error of a patient's score that is not attributed to 
true changes in the construct to be measured, 
expressed by the standard error of measurement 
(SEM) [8,10]. The SEM can be converted into the 
smallest detectable change (SDC) [10]. Changes 
exceeding the SDC can be labeled as change beyond 
measurement error [10]. Another approach is to cal- 
culate the limits of agreement (LoA) [11]. For deter- 
mining the adequacy of measurement error the SDC 
and/or LoA is related to the minimal important 
change (MIC) [12]. 

- Reliability: The proportion of the total variance in 
the measurements which is due to 'true' differences 
between patients [8]. This aspect is reflected by the 
Intraclass Correlation Coefficient (ICC) or Cohen's 
Kappa [8,13]. 
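
As an illustration of two of the reliability parameters above, the following minimal sketch computes Cronbach's α from item scores and converts a SEM into an SDC. The conversion SDC = 1.96 × √2 × SEM is the commonly used formula and is an assumption here, since the text only states that the SEM can be converted into the SDC [10]; the function names and the example data are hypothetical.

```python
import math
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the total score per respondent.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def smallest_detectable_change(sem):
    """SDC at the individual level, assuming SDC = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem


# Hypothetical data: three respondents, three perfectly interrelated items.
scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(scores)
sdc = smallest_detectable_change(1.0)
```

A change score larger than the value returned by `smallest_detectable_change` would be labeled as change beyond measurement error.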

Validity 

Validity is the extent to which a questionnaire measures 
the construct it is supposed to measure and contains 
the following measurement properties [8]: 

- Content validity: The degree to which the content 
of a questionnaire is an adequate reflection of the 
construct to be measured [8]. Important aspects are 
whether all items are relevant for the construct, aim, 
and target population and if no important items are 
missing (comprehensiveness) [14]. 

- Criterion validity: The extent to which scores on 
an instrument are an adequate reflection of a gold 
standard [8]. Since a real gold standard for health
status questionnaires is not available [14], we did
not evaluate criterion validity.

- Construct validity is divided into three aspects: 

♦ Cross-cultural validity: The degree to which
the performance of the items on a translated or
culturally adapted instrument is an adequate
reflection of the performance of the items of the
original version of the instrument [8]. This is
assessed by means of multi-group factor analysis 
or differential item functioning using data from a 
population that completed the questionnaire in 
the original language, as well as data from a 
population that completed the questionnaire in 
the new language. 

♦ Structural validity: The degree to which the 
scores of an instrument are an adequate reflec- 
tion of the dimensionality of the construct to be 
measured [8]. Factor analysis should be per- 
formed to confirm the number of subscales pre- 
sent in a questionnaire [14]. 

♦ Hypothesis testing: The degree to which a parti- 
cular measure relates to other measures in a way 
one would expect if it is validly measuring the 
supposed construct, i.e. in accordance with pre- 
defined hypotheses about the correlation or dif- 
ferences between the measures [8]. 

Responsiveness 

Responsiveness is the ability of an instrument to detect 
change over time in the construct to be measured [8]. 
Responsiveness is considered an aspect of validity, in a 
longitudinal context [14]. Therefore, the same standards 
apply as for validity: the correlation between change 
scores of two measures should be in accordance with 
predefined hypotheses [14]. Another approach is to con- 
sider the measurement instrument as a diagnostic test 
to distinguish improved and non-improved patients. The 
responsiveness of the instrument is then expressed as 
the area under the receiver operating characteristic curve
(AUC) [14]. 
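
The diagnostic-test approach described above can be sketched as follows. The AUC is computed here as the probability that a randomly chosen improved patient has a larger change score than a randomly chosen non-improved patient (ties count half), which is equivalent to the area under the ROC curve. The function name and the change scores are hypothetical.

```python
def auc(improved, not_improved):
    """Area under the ROC curve for change scores of two patient groups."""
    wins = 0.0
    for x in improved:
        for y in not_improved:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5  # ties contribute half
    return wins / (len(improved) * len(not_improved))


# Hypothetical change scores on a questionnaire.
perfect = auc([5, 6, 7], [1, 2, 3])   # groups fully separated
chance = auc([1, 2], [1, 2])          # groups indistinguishable
```
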

Interpretability

Interpretability is the degree to which one can assign 
qualitative meaning to quantitative scores [8]. This 
means that investigators should provide information 
about clinically meaningful differences in scores between 
subgroups, floor and ceiling effects, and the MIC [14]. 
Interpretability is not a measurement property, but an 
important characteristic of a measurement instrument 
[8]. 

Quality assessment 

Assessment of the methodological quality of the selected 
studies was carried out using the COSMIN checklist [9]. 
The COSMIN checklist consists of nine boxes with 
methodological standards for how each measurement 
property should be assessed. Each item was scored on a 
4-point rating scale (i.e. "poor", "fair", "good", or "excel- 
lent", see http://www.cosmin.nl). An overall score for 
the methodological quality of a study was determined by 
taking the lowest rating of any of the items in a box. 
The methodological quality of a study was evaluated per 
measurement property. Special attention was paid to the 
methodological quality of the translation process and
cross-cultural validation. The COSMIN box concerning 
this measurement property is presented in Table 1. 
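
The rule of taking the lowest item rating in a box can be sketched in a few lines. This is only an illustration of the scoring rule as described above; the function name and the example ratings are hypothetical.

```python
# Ordered from worst to best, following the 4-point COSMIN rating scale.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def box_score(item_ratings):
    """Overall methodological quality for one box: the lowest item rating."""
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one measurement property in one study.
overall = box_score(["excellent", "good", "fair", "excellent"])
```

A single item rated "poor" is therefore enough to rate the whole box "poor".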

Data extraction and assessment of (methodological) 
quality were performed by two reviewers (JMS, CBT) 
independently. In case of disagreement between the two 
reviewers, there was discussion in order to reach con- 
sensus. If necessary, a third reviewer (HCV) made the 
decision. 

Best evidence synthesis - levels of evidence 

To determine the overall quality of the measurement 
properties of the different questionnaires we synthesized 
the different studies per language by combining their 
results, adjusted for methodological quality of the stu- 
dies and the consistency of their results. The possible 
overall rating for a measurement property is "positive",
"indeterminate", or "negative", accompanied by levels of
evidence, similar to those proposed by the Cochrane
Back Review Group (see Table 2) [15,16].

To assess whether the results of the measurement 
properties were positive, negative, or indeterminate, we 
used criteria based on Terwee et al. (see Table 3) [17]. 
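
One possible reading of the levels of evidence in Table 2 is sketched below. It rests on assumptions that the text does not spell out: studies of poor methodological quality are ignored unless they are the only evidence, each study reports either a positive or a negative result, and a combination of one good and one fair study is treated as moderate evidence. The function name is hypothetical.

```python
def level_of_evidence(studies):
    """studies: list of (quality, result) tuples,
    quality in {"poor", "fair", "good", "excellent"}, result in {"+", "-"}."""
    usable = [(q, r) for q, r in studies if q != "poor"]
    if not usable:
        return "?"      # only studies of poor methodological quality
    results = {r for _, r in usable}
    if len(results) > 1:
        return "+/-"    # conflicting findings
    sign = results.pop()
    qualities = [q for q, _ in usable]
    n_good_or_better = sum(q in ("good", "excellent") for q in qualities)
    if "excellent" in qualities or n_good_or_better >= 2:
        strength = 3    # strong
    elif "good" in qualities or len(qualities) >= 2:
        strength = 2    # moderate
    else:
        strength = 1    # limited: one study of fair quality
    return sign * strength


rating = level_of_evidence([("fair", "+"), ("fair", "+")])
```
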

Results 

The search strategy resulted in a total of 3641 unique 
hits, of which 119 articles were selected based on their 
title and abstract. The full text assessment resulted in 
exclusion of another 68 articles. Reference checking did 
not result in additional articles. Twenty-four articles 
concerned original versions of neck-specific question- 
naires, which were evaluated in a separate systematic 
review. (Schellingerhout JM, Heymans MW, Verhagen 
AP, De Vet HC, Koes BW, Terwee CB: Measurement 
properties of disease-specific questionnaires in patients 
with neck pain: a systematic review, submitted) Finally, 
27 articles on translated questionnaires, evaluating 6 dif- 
ferent questionnaires in 15 different languages, were 
included in this study (see Figure 1). 
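
The selection counts can be checked with simple arithmetic. The sketch below uses only the numbers reported in the text and in Figure 1; the variable names are hypothetical.

```python
selected_on_title_abstract = 119

# Main reasons for exclusion after full text assessment (Figure 1).
exclusion_reasons = {
    "article not retrievable": 2,
    "not full text original article": 7,
    "validation not aim of study": 19,
    "neck pain not main complaint": 14,
    "specific neck disorder": 6,
    "not neck-specific questionnaire": 20,
}
excluded_on_full_text = sum(exclusion_reasons.values())
selected_on_full_text = selected_on_title_abstract - excluded_on_full_text

original_versions = 24
included = selected_on_full_text - original_versions

# Included articles per language (Figure 1).
per_language = {
    "Catalan": 1, "Chinese": 2, "Dutch": 4, "English": 1, "Finnish": 1,
    "French": 4, "German": 2, "Greek": 1, "Hindi": 1, "Iranian": 1,
    "Italian": 1, "Korean": 1, "Spanish": 3, "Swedish": 1, "Turkish": 3,
}
```
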

The general characteristics of these studies are pre- 
sented in Table 4. None of the included studies per- 
formed a cross-cultural validation (Table 1, items 14 
and 15), i.e. no studies performed multi-group factor 
analysis or differential item functioning. Therefore, we 
were only able to rate the methodological quality of the 
translation process (Table 1, items 4-11). The methodo- 
logical quality of the studies is presented in Table 5 for 
each measurement property, arranged per language. 
Generally the methodological quality of the studies was
poor to fair. The synthesis of the results per questionnaire
and their accompanying level of evidence is presented
in Table 6 for each language. For each
questionnaire, except for the Iranian NPDS and Spanish
NDI, at least half of the information regarding
measurement properties is lacking. Moreover, the evidence
for the quality of measurement properties is
mostly limited, due to methodological shortcomings of
the included studies.

Table 1 Methodological criteria for the translation process and cross-cultural validation [9]

Item  Methodological criteria
1     Was the percentage of missing items given?
2     Was there a description of how missing items were handled?
3     Was the sample size included in the analysis adequate?
4     Were both the original language in which the HR-PRO instrument was developed,
      and the language in which the HR-PRO instrument was translated described?
5     Was the expertise of the people involved in the translation process adequately described?
      e.g. expertise in the disease(s) involved, in the construct to be measured, or in both languages
6     Did the translators work independently from each other?
7     Were items translated forward and backward?
8     Was there an adequate description of how differences between the original and
      translated versions were resolved?
9     Was the translation reviewed by a committee (e.g. original developers)?
10    Was the HR-PRO instrument pre-tested (e.g. cognitive interviews) to check interpretation,
      cultural relevance of the translation, and ease of comprehension?
11    Was the sample used in the pre-test adequately described?
12    Were the samples similar for all characteristics except language and/or cultural background?
13    Were there any important flaws in the design or methods of the study?
14    for CTT: Was confirmatory factor analysis performed?
15    for IRT: Was differential item function (DIF) between language groups assessed?

CTT = Classical Test Theory, IRT = Item Response Theory

Below we will discuss the results for the different 
questionnaires per language. The results regarding mea- 
surement properties from studies of poor methodologi- 
cal quality are not mentioned [18-24]. 

Catalan 

The NDI is the only neck-specific questionnaire that has
been translated into Catalan [25]. The NDI was originally
designed to measure activities of daily living (ADL) in
patients with neck pain [1]. The methodological quality
of the translation process is poor [25]. Confirmatory factor
analysis showed that the NDI is not unidimensional
and there is limited evidence that the NDI has a 2-factor
structure [25]. Assuming a 2-factor structure, there is
moderate positive evidence for internal consistency:
Cronbach's α is 0.70 for "pain and interference with
cognitive functioning" and 0.83 for "functional disability"
[25]. There is a positive correlation (r = 0.51) between
the NDI and the Pain Intensity Index [25].

The available evidence on measurement properties of
the Catalan NDI is positive, despite the poor methodological
quality of the translation process.

Table 2 Levels of evidence for the overall quality of the
measurement property [16]

Level        Rating      Criteria
strong       +++ or ---  Consistent findings in multiple studies of good
                         methodological quality OR in one study of excellent
                         methodological quality
moderate     ++ or --    Consistent findings in multiple studies of fair
                         methodological quality OR in one study of good
                         methodological quality
limited      + or -      One study of fair methodological quality
conflicting  +/-         Conflicting findings
unknown      ?           Only studies of poor methodological quality

[..] = reference number
+ = positive result, - = negative result

Chinese 

The Northwick Park Neck Pain Questionnaire (NPQ) is
the only neck-specific questionnaire that has been translated
into Chinese [26-28]. The NPQ was originally
designed to measure the influence of non-specific neck 
pain on daily activities [29]. The methodological quality 
of the translation process is poor [26]. 

There is strong positive evidence for the reliability of the 
NPQ (ICC = 0.95) [26]. Hypothesis testing resulted in 
moderate positive evidence for correlation between the 
NPQ and instruments measuring pain and physical func- 
tioning (r = 0.59-0.75) [26,27]. Differences in score 
between subgroups have been reported (e.g. healthy per- 
sons vs. neck pain patients, and patients who sought medi- 
cal consultation vs. those who did not) [26]. The average 
time needed to fill out the NPQ is 5.5 minutes [26]. 

The available information on measurement properties
of the Chinese NPQ looks promising, despite the poor
methodological quality of the translation process.

Table 3 Quality criteria for measurement properties (based on Terwee et al. [17])

Property                  Rating  Quality criteria

Reliability
Internal consistency      +       (Sub)scale unidimensional AND Cronbach's alpha(s) > 0.70
                          ?       Dimensionality not known OR Cronbach's alpha not determined
                          -       (Sub)scale not unidimensional OR Cronbach's alpha(s) < 0.70
Measurement error         +       MIC > SDC OR MIC outside the LOA
                          ?       MIC not defined
                          -       MIC < SDC OR MIC equals or inside LOA
Reliability               +       ICC/weighted Kappa > 0.70 OR Pearson's r > 0.80
                          ?       Neither ICC/weighted Kappa, nor Pearson's r determined
                          -       ICC/weighted Kappa < 0.70 OR Pearson's r < 0.80

Validity
Content validity          +       The target population considers all items in the questionnaire to be relevant
                                  AND considers the questionnaire to be complete
                          ?       No target population involvement
                          -       The target population considers items in the questionnaire to be irrelevant
                                  OR considers the questionnaire to be incomplete
Construct validity
- Cross-cultural validity +       Original factor structure confirmed OR no important DIF
                          ?       Confirmation original factor structure AND DIF not mentioned
                          -       Original factor structure not confirmed OR important DIF
- Structural validity     +       Factors should explain at least 50% of the variance
                          ?       Explained variance not mentioned
                          -       Factors explain < 50% of the variance
- Hypothesis testing      +       (Correlation with an instrument measuring the same construct > 0.50 OR
                                  at least 75% of the results are in accordance with the hypotheses) AND
                                  correlation with related constructs is higher than with unrelated constructs
                          ?       Solely correlations determined with unrelated constructs
                          -       Correlation with an instrument measuring the same construct < 0.50 OR
                                  < 75% of the results are in accordance with the hypotheses OR
                                  correlation with related constructs is lower than with unrelated constructs

Responsiveness
Responsiveness            +       (Correlation with an instrument measuring the same construct > 0.50
                                  OR at least 75% of the results are in accordance with the hypotheses
                                  OR AUC > 0.70) AND correlation with related constructs is higher
                                  than with unrelated constructs
                          ?       Solely correlations determined with unrelated constructs
                          -       Correlation with an instrument measuring the same construct < 0.50 OR
                                  < 75% of the results are in accordance with the hypotheses OR AUC < 0.70
                                  OR correlation with related constructs is lower than with unrelated constructs

[..] = reference number, MIC = minimal important change, SDC = smallest detectable change, LOA = limits of agreement, ICC = intraclass correlation coefficient,
DIF = differential item functioning, AUC = area under the curve
+ = positive rating, ? = indeterminate rating, - = negative rating
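
Two of the criteria in Table 3 can be expressed as simple rating functions. This is a minimal sketch: only the SDC-based part of the measurement error criterion and the ICC-based part of the reliability criterion are covered, and values exactly at a threshold are rated negative, which is an assumption because the criteria as printed do not cover that case. The function names are hypothetical; the example values are the ones reported in this review for the Dutch NDI (MIC = 3.5, SDC = 10.5) and the Chinese NPQ (ICC = 0.95).

```python
def rate_measurement_error(sdc, mic=None):
    """'+' if MIC > SDC, '?' if the MIC is not defined, '-' otherwise."""
    if mic is None:
        return "?"
    return "+" if mic > sdc else "-"


def rate_reliability(icc=None):
    """'+' if ICC > 0.70, '?' if no ICC was determined, '-' otherwise."""
    if icc is None:
        return "?"
    return "+" if icc > 0.70 else "-"


dutch_ndi_error = rate_measurement_error(sdc=10.5, mic=3.5)
chinese_npq_reliability = rate_reliability(0.95)
```
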

Dutch 

The NDI, NPDS, and Neck Bournemouth Questionnaire
(NBQ) have been translated into Dutch [19,29-31]. The
NPDS was originally designed to measure pain and disability
in patients with neck pain [2]. The NBQ was
originally designed to measure pain, physical functioning,
social functioning, and psychological functioning in
patients with non-specific neck pain [32]. The translation
process of the NDI is not described, so the quality
of this process is unknown. The methodological quality
of the translation process of the NPDS is fair [19], and
that of the NBQ is excellent [30].

There is limited positive evidence for the reliability of
the NDI (ICC = 0.90) [31], and for responsiveness
(sensitivity = 0.9 and specificity = 0.7 for a clinically
important change of 3.5) [29]. There is limited negative
evidence for its measurement error (MIC = 3.5 and SDC = 
10.5 on a 0-50 scale) [29]. There is limited positive evi- 
dence for the reliability of the NBQ (ICC = 0.92) [30]. The 
result for measurement error of the NBQ is indeterminate, 
because the MIC is not defined [30]. No floor or ceiling 
effects have been detected for the NDI or NBQ, and for 
both questionnaires differences in score between sub- 
groups have been reported (men vs. women) [30,31]. 

The lack of information derived from these studies
makes it difficult to identify the best available neck-specific
questionnaire in Dutch. Based on the information
available on the measurement properties of the original
versions of the NDI and NBQ, we advise using the Dutch
NDI. (Schellingerhout JM, Heymans MW, Verhagen AP,
De Vet HC, Koes BW, Terwee CB: Measurement proper- 
ties of disease-specific questionnaires in patients with 
neck pain: a systematic review, submitted) 

English 

The originally Danish Copenhagen Neck Functional
Disability Scale (CNFDS) is the only neck-specific


Table 4 General information per study

Study                         Language  Country      Population                         Setting
Nieto et al. [25]             Catalan   Spain        < 3 months whiplash                rehabilitation unit
Chiu et al. [26]              Chinese   Hong Kong    neck pain                          physiotherapist
Lee et al. [27]               Chinese   Hong Kong    neck pain                          physiotherapist
Jorritsma et al. [19]         Dutch     Netherlands  > 3 months non-specific neck pain  rehabilitation unit
Pool et al. [29]              Dutch     Netherlands  non-specific neck pain             general practitioner
Schmitt et al. [30]           Dutch     Netherlands  > 3 weeks whiplash                 general population
Vos et al. [31]               Dutch     Netherlands  < 6 weeks non-specific neck pain   general practitioner
Stewart et al. [33]           English   Australia    > 3 months whiplash                physiotherapist
Salo et al. [35]              Finnish   Finland      neck pain                          physiotherapist/rehabilitation unit
Forestier et al. [18]         French    France       > 3 months mechanical neck pain    general population
Martel et al. [37]            French    Canada       > 12 weeks mechanical neck pain    general population
Wlodyka-Demaille et al. [36]  French    France       > 15 days non-specific neck pain   rehabilitation unit/rheumatologist
Wlodyka-Demaille et al. [20]  French    France       > 15 days non-specific neck pain   rehabilitation unit/rheumatologist
Bremerich et al. [24]         German    Switzerland  > 3 months non-specific neck pain  rheumatologist
Scherer et al. [38]           German    Germany      neck pain                          general practitioner
Trouli et al. [39]            Greek     Greece       non-specific neck pain             primary care
Agarwal et al. [40]           Hindi     India        cervical radiculopathy             physiotherapist
Mousavi et al. [41]           Iranian   Iran         non-specific neck pain             primary care/physiotherapist
Monticone et al. [42]         Italian   Italy        > 4 weeks non-specific neck pain   rehabilitation unit
Lee et al. [43]               Korean    South Korea  non-specific neck pain             physiotherapist
Andrade et al. [46]           Spanish   Spain        non-specific neck pain             rehabilitation unit
Gonzalez et al. [44]          Spanish   Spain        > 4 months non-specific neck pain  physiotherapist
Kovacs et al. [23]            Spanish   Spain        non-specific neck pain             primary care/hospital outpatient clinic
Ackelman et al. [22]          Swedish   Sweden       acute/chronic neck pain            emergency room/physiotherapist
Aslan et al. [47]             Turkish   Turkey       > 3 months non-specific neck pain  physiotherapist/rehabilitation unit
Bicer et al. [21]             Turkish   Turkey       > 6 months non-specific neck pain  rehabilitation unit
Kose et al. [48]              Turkish   Turkey       > 6 weeks non-specific neck pain   primary care

[..] = reference number



Figure 1 Flowchart search and selection

Articles retrieved by search strategy (n = 3641)
Articles selected based on title and abstract (n = 119)
Articles selected based on full text (n = 51)
  Main reason for exclusion: article not retrievable (n = 2); not full text original article (n = 7);
  validation not aim of study (n = 19); neck pain not main complaint (n = 14);
  specific neck disorder (n = 6); not neck-specific questionnaire (n = 20)
Exclusion of original versions (n = 24)
No. of articles included in review (n = 27) per language: Catalan (n = 1), Chinese (n = 2), Dutch (n = 4),
  English (n = 1), Finnish (n = 1), French (n = 4), German (n = 2), Greek (n = 1), Hindi (n = 1),
  Iranian (n = 1), Italian (n = 1), Korean (n = 1), Spanish (n = 3), Swedish (n = 1), Turkish (n = 3)
Table 5 Methodological quality of each study per measurement property

Rated per study and instrument: Translation process, Internal Consistency, Measurement Error, Reliability,
Content Validity, Structural Validity, Hypotheses Testing, Responsiveness

Language  Study                  Instrument
Catalan   Nieto et al. [25]      NDI
Chinese   Chiu et al. [26]       NPQ
          Lee et al. [27]        NPQ
Dutch     Jorritsma et al. [19]  NDI
                                 NPDS
          Pool et al. [29]       NDI
          Schmitt et al. [30]    NBQ
          Vos et al. [31]        NDI
English   Stewart et al. [33]    CNFDS
Finnish   Salo et al. [35]       NDI
                                 NPDS
French    Forestier et al. [18]  CNFDS
          Martel et al. [37]     NBQ
          Wlodyka et al. [36]    NDI
                                 NPDS
                                 NPQ
          Wlodyka et al. [20]    NDI
                                 NPDS
                                 NPQ
German    Bremerich et al. [24]  NPDS
          Scherer et al. [38]    NPDS
Greek     Trouli et al. [39]     NDI
Hindi     Agarwal et al. [40]    NPDS
Iranian   Mousavi et al. [41]    NDI
                                 NPDS
Italian   Monticone et al. [42]  NPDS
Korean    Lee et al. [43]        NDI
                                 NPDS
Spanish   Andrade et al. [46]    NDI
          Gonzalez et al. [44]   NPQ
          Kovacs et al. [23]     NDI
                                 NPQ
                                 CNQ
Swedish   Ackelman et al. [22]   NDI
Turkish   Aslan et al. [47]      NDI

poor 
poor 

fair
excellent 



poor 
poor 

poor 
poor 
poor 
poor 
poor 



fair
poor 

good 

fair 

excellent 
excellent 

poor 

poor 
poor 



poor 
excellent 



excellent 



excellent 



good 



poor 



poor 



excellent 
excellent 

poor 

poor 
poor 
poor 



excellent 

good 

poor 

fair 
fair 

fair 

fair 
poor 

fair
poor 
poor 
poor 
poor 



fair 



poor 
poor 
fair 
fair 
fair 



poor 
poor 
poor 



poor 

poor 
poor 



poor 
poor 

poor 



excellent 



poor 
poor 



poor 



fair 
fair 



poor 
poor 



poor 
poor 
poor 
poor 



good 
good 



fair 
fair 
fair 



poor 

poor 

poor 

fair 
fair 

fair
poor 
poor 

poor 
fair 
poor 

poor 

poor 

fair



good 
good 



poor 



poor 
poor 



fair 
fair 



fair 



good 

fair 
fair



poor 



poor 
poor 



fair 
fair
fair
fair



good 



fair 



poor 



poor 

fair 
fair
fair 
poor 
poor 
poor 
poor 

poor 
fair



poor 
poor 



fair 
poor 
fair 



poor 
moderate 



poor 
poor 
poor 



fair



fair 
fair 



poor 
poor 

fair 
poor 
poor 
poor 
poor 
Table 5 Methodological quality of each study per measurement property (Continued)

Study              Instrument  Ratings (non-empty cells, left to right)
Bicer et al. [21]  NPDS        poor, poor, poor
Kose et al. [48]   NDI         fair, poor, fair, poor, fair
                   NPDS        fair, poor, fair, poor, fair
                   NPQ         fair, poor, fair, poor, fair
                   CNFDS       fair, poor, fair, poor, fair

[..] = reference number



questionnaire that has been translated into English [33].
The CNFDS was originally designed to measure disability
in patients with neck pain [34]. The translation process
is not described, so the quality of this process is
unknown. There is limited positive evidence for the
responsiveness of the CNFDS (AUC = 0.73) [33]. Many
neck-specific questionnaires were originally developed
in English. We advise using one of these questionnaires,
preferably the NDI. (Schellingerhout JM,
Heymans MW, Verhagen AP, De Vet HC, Koes BW,
Terwee CB: Measurement properties of disease-specific
questionnaires in patients with neck pain: a systematic
review, submitted)

Finnish 

The NDI and NPDS have been translated into Finnish
[35]. The methodological quality of the translation pro- 
cess of these questionnaires is poor [35]. 



Table 6 Quality of the measurement properties per language and questionnaire

Language  Instrument  Internal consistency  Measurement error  Reliability  Content validity  Structural validity*  Hypotheses testing  Responsiveness
Catalan   NDI         ++                    na                 na           na                - (1), + (2)          ++                  na
Chinese   NPQ         ?                     na                 +++          ?                 na                    ++                  ?
Dutch     NDI         na                    -                  +            na                na                    na                  +
          NPDS        na                    ?                  ?            na                na                    na                  na
          NBQ         ?                     ?                  +            na                na                    ?                   na
English   CNFDS       na                    na                 na           na                na                    na                  +
Finnish   NDI         ?                     na                 ?            na                -- (1)                ?                   na
          NPDS        +++                   na                 ?            na                ++ (3)                ?                   na
French    NDI         na                    ?                  ?            na                + (2)                                     ?
          NPDS        na                    ?                  ?            na                +                     +/-                 ?
          NBQ         na                    na                 ?            na                na                    +/-
          NPQ         na                    ?                  ?            na                +                     +/-                 ?
          CNFDS       ?                     na                 na           na                na                    na                  ?
German    NPDS        ?                     ?                  ?            na                ++                    ++                  na
Greek     NDI         ?                     ?                  ?            na                                      na
Hindi     NPDS        ?                     ?                  ?            ?                 na                    +/-                 na
Iranian   NDI         +                     na                 +            ?                 na                    na                  +
          NPDS        +                     na                 +            ?                 +                     na
Italian   NPDS        +                     na                 +            na                +                     ?                   na
Korean    NDI         +                     ?                  ?            na                na                    ?                   ?
          NPDS        ?                     ?                  ?            na                na                    ?                   ?
Spanish   NDI         +                     na                 ?            na                +                     +                   +
          NPQ         ?                     na                              na                na                    ?                   ?
          CNQ         ?                     na                 ?            na                na                    ?                   ?
Swedish   NDI         na                    na                 ?            ?                 na                    ?                   na
Turkish   NDI         ?                     na                 ++           na                na                    +                   +
          NPDS        ?                     na                 +            na                na                    ?                   +
          NPQ         ?                     na                 +            na                na                    ?                   +
          CNFDS       ?                     na                 +            na                na                    ?                   +

+++ or --- = strong evidence positive/negative result, ++ or -- = moderate evidence positive/negative result, + or - = limited evidence positive/negative result,
+/- = conflicting evidence, ? = unknown, due to poor methodological quality, na = no information available
* the numbers in parentheses reflect the number of factors that are mentioned in the underlying studies

There is moderate evidence that the NDI is not
one-dimensional and that the NPDS has a 3-factor structure
[35]. The result for internal consistency of the NDI is
indeterminate, because the authors incorrectly assume a
1-factor model [35]. There is strong positive evidence for
the internal consistency of the NPDS (Cronbach α =
0.82-0.84) [35]. No floor or ceiling effects have been
detected for the NDI or NPDS, and for both questionnaires
differences in score between subgroups have been
reported (stable vs. improved patients) [35].

The available information suggests that the Finnish 
NPDS has better measurement properties than the Fin- 
nish NDI. 

French 

The following neck-specific questionnaires have been
translated into French: NDI [20,36], NPDS [20,36], NBQ
[37], NPQ [20,36], and CNFDS [18]. The methodological
quality of all these translation processes is poor
[18,36,37].

There is limited evidence that the NDI has a 2-factor
structure [20]. Hypothesis testing showed that the NDI
correlates somewhat more strongly with an instrument
measuring psychological functioning (r = 0.55) than
with instruments measuring pain (r = 0.48) and physical
functioning (r = 0.50) [20].

There is limited evidence that the NPDS has a 3-factor
structure [20]. Hypothesis testing showed a positive
result for the correlation of the NPDS with instruments
measuring pain (r = 0.52) and physical functioning
(r = 0.63), and a negative result (slightly below the
pre-set criterion of r = 0.5) for the correlation with
instruments measuring psychological functioning
(r = 0.40-0.49) [20].

Hypothesis testing showed a positive result for the
correlation of the NBQ with an instrument measuring pain
and physical functioning (r = 0.61-0.67), and a negative
result for the correlation with an instrument measuring
psychological functioning (r = 0.17-0.25) [37]. There is
limited negative evidence for the responsiveness of the
NBQ (r = 0.42) [37].

There is limited evidence that the NPQ has a 2-factor
structure [20]. Hypothesis testing showed a positive
result for the correlation of the NPQ with an instrument
measuring physical functioning (r = 0.53), and a negative
result for the correlation with an instrument measuring
pain (r = 0.43) [20].
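
The positive and negative results above follow from comparing each reported correlation with the pre-set criterion of r = 0.5. A minimal sketch of that comparison is given below, using the correlations reported for the French NPDS [20]. The function name `rate_correlation`, and the choice to treat a value exactly equal to 0.5 as meeting the criterion, are assumptions for illustration; the source only states that values slightly below 0.5 were rated negative.

```python
CRITERION = 0.5  # pre-set criterion quoted in the text


def rate_correlation(r_low, r_high=None, criterion=CRITERION):
    """Rate a reported correlation (or range of correlations).

    Returns 'positive' when the whole reported range meets the criterion
    and 'negative' otherwise. Using '>=' is an assumption.
    """
    r_high = r_low if r_high is None else r_high
    return "positive" if min(r_low, r_high) >= criterion else "negative"


# Correlations reported for the French NPDS [20].
french_npds = {
    "pain": (0.52, 0.52),
    "physical functioning": (0.63, 0.63),
    "psychological functioning": (0.40, 0.49),
}

ratings = {construct: rate_correlation(low, high)
           for construct, (low, high) in french_npds.items()}
for construct, rating in ratings.items():
    print(f"{construct}: {rating}")
```

Mixed ratings across constructs, as for the NPDS here, are what lead to a conflicting (+/-) overall result for hypotheses testing.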

No floor or ceiling effects have been detected for the 
NDI, NPDS, and NPQ [20,36]. The average time needed 
to fill out the NDI, NPDS, and NPQ is 7.4, 6.4, and 7.2 
minutes, respectively [36]. 

The lack of information derived from these studies
makes it difficult to identify the best available
neck-specific questionnaire in French. Based on the
information available on the measurement properties of the
original versions of the NDI, NPDS, NBQ, NPQ, and
CNFDS, we advise developing a high-quality translation
of the NDI. (Schellingerhout JM, Heymans MW,
Verhagen AP, De Vet HC, Koes BW, Terwee CB:
Measurement properties of disease-specific
questionnaires in patients with neck pain: a systematic
review, submitted)

German 

The NPDS is the only neck-specific questionnaire that
has been translated into German [24,38]. There are two
German translations of the NPDS: one translation
process of poor and one of fair methodological quality
[24,38].

Factor analysis provided moderate evidence that the
NPDS has a 3-factor structure [38]. The result for
internal consistency is indeterminate, because the
authors incorrectly assume a 1-factor model [38]. There is
moderate positive evidence for hypothesis testing (>75%
of results in accordance with predefined hypotheses)
[38]. No floor or ceiling effects have been detected for
the NPDS [38].
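
The ">75% of results in accordance with predefined hypotheses" rule mentioned above can be written down as a simple proportion check. The sketch below is illustrative only: the function name `meets_hypothesis_criterion` and the counts are hypothetical, since the number of hypotheses tested for the German NPDS is not reported here.

```python
def meets_hypothesis_criterion(confirmed, total, threshold=0.75):
    """True when more than `threshold` of the predefined hypotheses
    are confirmed by the results (strictly greater, as in '>75%')."""
    if total <= 0 or not 0 <= confirmed <= total:
        raise ValueError("need total > 0 and 0 <= confirmed <= total")
    return confirmed / total > threshold


# Hypothetical counts, for illustration only.
print(meets_hypothesis_criterion(7, 8))  # 87.5% of hypotheses confirmed
print(meets_hypothesis_criterion(3, 4))  # exactly 75% does not exceed 75%
```

The rule presupposes that the hypotheses were defined before the analysis; correlations interpreted after the fact cannot be counted this way.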

The available information on measurement properties 
of the German NPDS looks promising, despite the poor 
methodological quality of the translation process. 

Greek 

The NDI is the only neck-specific questionnaire that has
been translated into Greek [39]. The methodological
quality of the translation process is good [39].

Exploratory factor analysis provided moderate evidence
that the NDI does not have a 1-factor structure
[39]. The result for internal consistency is
indeterminate, because the authors incorrectly assume a
1-factor model [39]. There is limited negative evidence for
responsiveness (r = 0.30 with Global Rating of Change)
[39].

Based on the good quality of the translation process
and the negative results for unidimensionality and
responsiveness, we advise performing a cross-cultural
validation of the Greek NDI.

Hindi 

The NPDS is the only neck-specific questionnaire that
has been translated into Hindi [40]. The methodological
quality of the translation process is fair [40].

Hypothesis testing showed a positive result for corre- 
lation of the NPDS with an instrument measuring psy- 
chological functioning (r = 0.80), and a negative result 
for correlation with an instrument measuring pain (r = 
0.30), and an instrument measuring physical functioning 
(r = 0.15). The average time needed to fill out the 
NPDS was 8 minutes [40]. 

Based on the information derived from this study, we
advise developing a high-quality translation of the NDI.

Iranian

The NDI and NPDS have been translated into Iranian
[41]. The methodological quality of the translation
processes is excellent [41].

There is limited positive evidence for the internal 
consistency (Cronbach alpha = 0.88, assuming a 1-fac- 
tor structure), reliability (ICC = 0.97), and responsive- 
ness (r = 0.65 for physical functioning and r = 0.70 
for pain) of the NDI [41]. Exploratory factor analysis 
resulted in limited positive evidence for a 4-factor 
structure of the NPDS [41]. There is limited positive
evidence for the internal consistency (Cronbach alpha =
0.75-0.94 for the four subscales) and reliability (ICC =
0.97) of the NPDS [41]. There is limited negative evidence for
responsiveness of the NPDS, because correlation with 
change scores on instruments measuring the same 
constructs was lower than correlation with instru- 
ments measuring other constructs [41]. No floor or 
ceiling effects have been detected for the NDI or 
NPDS [41]. 

The Iranian NDI and NPDS both seem to have ade- 
quate measurement properties, but we advise using the 
NDI, based on the negative result for responsiveness of 
the NPDS and the good measurement properties of the 
original version of the NDI. (Schellingerhout JM, Hey- 
mans MW, Verhagen AP, De Vet HC, Koes BW, Ter- 
wee CB: Measurement properties of disease-specific 
questionnaires in patients with neck pain: a systematic 
review, submitted) 

Italian 

The NPDS is the only neck-specific questionnaire that
has been translated into Italian [42]. The methodological
quality of the translation process is poor [42].

There is limited evidence that the NPDS has a 3-factor
structure (explained variance = 63%) [42]. A confirmatory
analysis with 4 factors showed a small improvement in
explained variance (67%) [42]. Assuming a 3-factor
structure, there is limited positive evidence for internal
consistency: Cronbach α was 0.92 for "neck dysfunction
related to general activities", 0.86 for
"cognitive-behavioral aspects", and 0.89 for "neck
dysfunction related to activities of the cervical spine"
[42]. There is limited positive evidence for the
reliability of the NPDS (r = 0.89-0.93) [42]. The average
time needed to fill out the NPDS is 7.5 minutes [42].
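
Cronbach's α is only interpretable for a set of items that measures one dimension, which is why it is reported per subscale above rather than for the whole questionnaire. The sketch below computes α separately for each subscale using the standard formula. The subscale names, the item groupings, and the scores are synthetic and purely illustrative; they are not the Italian NPDS data from [42].

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one (sub)scale.

    item_scores: list of items, each a list with one score per respondent.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("all items need the same number of respondents")
    # Total score per respondent across the items of this (sub)scale.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / total_var)


# Synthetic scores for two hypothetical subscales (5 respondents each).
subscales = {
    "subscale A": [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]],
    "subscale B": [[0, 1, 1, 3, 4], [1, 1, 2, 3, 3]],
}
for name, items in subscales.items():
    print(name, round(cronbach_alpha(items), 2))
```

Pooling all items into a single α, as if the questionnaire had one factor, is the step that makes an internal consistency result indeterminate when the factor analysis points to several factors.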

The available information on measurement properties 
of the Italian NPDS looks promising, despite the poor 
methodological quality of the translation. 

Korean 

The NDI and NPDS have been translated into Korean
[43]. The methodological quality of the translation
processes is poor [43].



There is limited positive evidence regarding the internal
consistency of the NDI (Cronbach α = 0.92, assuming
a 1-factor structure) [43]. No floor or ceiling effects
have been detected for the NDI or NPDS, and differences
in score between subgroups have been reported
(neck pain patients vs. healthy persons) [43].

Lack of information makes it difficult to determine
whether the Korean NDI or NPDS has the better
measurement properties. Based on the information available
on the measurement properties of the original versions
of the NDI and NPDS, we advise using the Korean
NDI. (Schellingerhout JM, Heymans MW, Verhagen AP,
De Vet HC, Koes BW, Terwee CB: Measurement properties
of disease-specific questionnaires in patients with
neck pain: a systematic review, submitted)

Spanish 

The NDI, NPQ, and Core Neck Questionnaire (CNQ)
have been translated into Spanish [23,44]. The CNQ was
originally designed to measure outcomes of care in
patients with non-specific neck pain [45]. The
methodological quality of the translation process of the
NPQ is poor [44], and that of the NDI and CNQ is
excellent [23].

There is limited positive evidence for a 1-factor
structure of the NDI and its internal consistency (Cronbach
α = 0.89) [46]. Hypothesis testing showed a positive
result for correlation of the NDI with an instrument 
measuring pain (r = 0.65), and an instrument measuring 
physical functioning (r = 0.89) [46]. There is limited 
positive evidence for the responsiveness of the NDI [46]. 
There is limited negative evidence regarding the reliabil- 
ity of the NPQ (ICC = 0.63) [44]. No floor or ceiling 
effects have been detected for the NDI, NPQ, or CNQ, 
and scores across different categories of pain intensity 
have been reported [23]. The average time needed to fill 
out the NDI and CNQ is 4.0 and 2.1 minutes, respec- 
tively [23]. 

Based on the available information, we advise using
the Spanish NDI.

Swedish 

The NDI is the only neck-specific questionnaire that has
been translated into Swedish [22]. The methodological
quality of the translation process is unknown. No floor
or ceiling effects have been detected for the NDI [22].

Based on the lack of information, we advise performing
high-quality studies to fill in the missing information
on the measurement properties of the Swedish
NDI.

Turkish 

The following neck-specific questionnaires have been
translated into Turkish and evaluated: NDI [47,48], NPDS
[21,48], NPQ [48], and CNFDS [48]. There are two
translations of the NDI in Turkish: one translation
process was of excellent methodological quality [47], and
one of fair methodological quality [48]. There are two
translations of the NPDS as well: one translation process
was of poor methodological quality [21], and one of fair
methodological quality [48]. The translation processes of
the NPQ and CNFDS are both of fair methodological
quality [48].

There is moderate positive evidence for the reliability
of the NDI (ICC = 0.86-0.98) [47,48], and limited
positive evidence for hypothesis testing (r = 0.66-0.73 with
instruments measuring pain and/or disability) and
responsiveness (r = 0.79, with a physician's assessment
of health) [47,48]. There is limited positive evidence for
the reliability (ICC NPDS = 0.81, ICC NPQ = 0.85,
ICC CNFDS = 0.84) and responsiveness (r NPDS = 0.79,
r NPQ = 0.81, and r CNFDS = 0.65, with a physician's
assessment of health on a scale of 0 to 100) of the
NPDS, NPQ, and CNFDS [48].

The average time needed to fill out the NDI, NPDS, 
NPQ, and CNFDS is 8.8, 10.2, 8.4, and 6.8 minutes, 
respectively [48]. All 4 translated questionnaires show 
promising results, but we advise using the NDI, because 
of the excellent methodological quality of the translation 
process and the good measurement properties of the 
original version. (Schellingerhout JM, Heymans MW, 
Verhagen AP, De Vet HC, Koes BW, Terwee CB: Mea- 
surement properties of disease-specific questionnaires in 
patients with neck pain: a systematic review, submitted) 

Discussion 

Translated versions of neck-specific questionnaires have 
been evaluated in 15 different languages. Generally the 
methodological quality of the translation process is 
poor, which was mainly due to the fact that the trans- 
lated version was not pre-tested in the target population. 
Furthermore, none of the included studies performed a 
cross-cultural validation. This is necessary to evaluate 
whether the constructs underlying the original question- 
naire are represented adequately by the questionnaire 
items in the new language. For each questionnaire, 
except for the Iranian NPDS and Spanish NDI, at least 
half of the information regarding measurement proper- 
ties was lacking. Moreover, the evidence for the quality 
of measurement properties of the translated versions is 
mostly limited, due to methodological shortcomings of 
the included studies. 

The COSMIN checklist has recently been developed 
and is based on consensus between experts in the field 
of health status questionnaires [9]. The COSMIN check- 
list facilitates a separate judgment of the methodological 
quality of the included studies and their results. This is 
in line with the methodology of systematic reviews of 
clinical trials [15]. The criteria in Table 2 are based on
the levels of evidence as previously proposed by the
Cochrane Back Review Group [16]. The criteria were
originally meant for systematic reviews of clinical trials, but
we believe that they are also applicable to reviews of
measurement properties of health status questionnaires.

Exclusion of non-English papers may introduce selec- 
tion bias. However, the leading journals, and as a conse- 
quence the most important studies, are published in 
English. So, research performed in populations with a 
different native language is generally still published in 
English. This is illustrated by the large number of arti- 
cles we retrieved regarding translations of neck-specific 
questionnaires (see Figure 1). Thus, we argue that the 
most important translations have been included in our 
study. 

Many studies showed similar methodological
shortcomings. Aspects that need to be improved are: the
assessment of unidimensionality in internal consistency
analysis, the use of stable patients and similar test
conditions in studies on reliability and measurement
error, and the use of predefined hypotheses in studies on
construct validity and responsiveness. We do not discuss
these flaws here, because we have elaborated on this
subject in a separate paper.
(Terwee CB, Schellingerhout JM, Verhagen AP, de Vet
HC, Koes BW: Assessing the measurement properties of
neck disability questionnaires: room for improvement,
submitted)

We pooled the results per language, which neglects 
the fact that populations might share the same language, 
but differ in cultural context [3]. However, we think that 
this did not affect our results, because the only inconsis- 
tency in results for the same language version was 
found for the Chinese NPQ and the populations in the 
two studies evaluating the Chinese NPQ came from the 
same region in China and were similar in context 
[26,27]. 

A systematic review of the measurement properties of 
the original version of neck-specific questionnaires 
showed that for each questionnaire, except for the NDI, 
at least half of the information regarding measurement 
properties was lacking. The available results were mainly 
positive, but the evidence was mostly limited. (Schellin- 
gerhout JM, Heymans MW, Verhagen AP, De Vet HC, 
Koes BW, Terwee CB: Measurement properties of dis- 
ease-specific questionnaires in patients with neck pain: a 
systematic review, submitted) This systematic review of 
translated questionnaires shows similar findings, except 
that the results for construct validity and responsiveness 
are more frequently inconsistent or negative. These
inconsistencies correspond with those found
for translations of the McGill Pain Questionnaire [7]. A
possible explanation for this difference in results
between original questionnaires and their translated
counterparts is the poor methodological quality of the
translation process and/or the lack of cross-cultural
validation [3,4].

A poor translation process and/or lack of cross-cul- 
tural validation seem to primarily affect the validity of 
the questionnaire. This is illustrated by the differences 
found between the results for structural validity of the 
translated versions and their original counterparts, and 
the negative/inconsistent results for hypothesis testing 
of the translated questionnaires. This is not surprising, 
as the importance and/or meaning of questionnaire 
items (e.g. driving, depressed mood) may depend on set- 
ting and context. So, a simple translation of the original 
questionnaire is not sufficient and might affect the 
underlying constructs. The translation process does not 
seem to affect the reliability of the questionnaire. This is 
illustrated by the fact that 95% of the results for internal 
consistency and reliability are positive, regardless of the 
methodological quality of the translation process. 

A recent review concluded that the translated versions 
of the NDI into Brazilian-Portuguese, Dutch, French, 
Korean, and Spanish are of high quality [6]. A possible 
explanation for discrepancies with our findings is that 
the methodological quality of the translation process 
was not taken into account in that review. The same
applies to a state-of-the-art review of the NDI, in
which a list of available translations is recommended
without critical appraisal of the quality of the translation
process, the cross-cultural validation, or the
measurement properties [5].

This study evaluates the measurement properties of 
translated versions of neck-specific questionnaires, 
thereby providing an overview of their availability and 
making it possible to choose the best questionnaire for 
a specific study population. However, it is advisable to 
use them cautiously, since the evidence is mostly lim- 
ited and for each of these translations, except for the 
Spanish NDI, at least half of the information regarding 
measurement properties is lacking. For clinical
research and practice we advise using the following
questionnaires: the Catalan, Dutch, English, Iranian,
Korean, Spanish and Turkish versions of the NDI, the
Chinese version of the NPQ, and the Finnish, German
and Italian versions of the NPDS. This is based on the
available results for the measurement properties of
these translations, and in the case of the Dutch,
English, and Korean NDI on the measurement properties
of the original version. (Schellingerhout JM, Heymans
MW, Verhagen AP, De Vet HC, Koes BW, Terwee CB:
Measurement properties of disease-specific
questionnaires in patients with neck pain: a systematic
review, submitted) The Greek NDI needs cross-cultural
validation and, due to the poor methodological quality of
the available study, there is no information on the Swedish
NDI. For all other languages it is advisable to first
choose the best available original version of the
neck-specific questionnaires and perform a high-quality
translation of this questionnaire. Our previous
systematic review on the original versions of all
neck-specific questionnaires showed that the NDI was the
best questionnaire. (Schellingerhout JM, Heymans MW,
Verhagen AP, De Vet HC, Koes BW, Terwee CB:
Measurement properties of disease-specific questionnaires
in patients with neck pain: a systematic review,
submitted)

For future research we recommend performing high 
quality studies to fill in the information on the unknown 
measurement properties. 

Conclusion 

Translated versions of neck-specific questionnaires have 
been evaluated in 15 different languages. Generally the 
methodological quality of the translation process is poor 
and none of the included studies performed a cross-cul- 
tural validation. A substantial amount of information 
regarding the measurement properties of translated ver- 
sions of the different neck-specific questionnaires is still 
lacking or assessed in studies of poor methodological 
quality. As a result the available evidence on the mea- 
surement properties is mostly limited. So, it is advisable 
to use the available translated questionnaires cautiously. 
For the time being we advise using the following
questionnaires in clinical research and practice: the Catalan,
Dutch, English, Iranian, Korean, Spanish and Turkish
versions of the NDI, the Chinese version of the NPQ,
and the Finnish, German and Italian versions of the
NPDS. The Greek NDI needs cross-cultural validation
and there is no methodologically sound information for 
the Swedish NDI. Studies of high methodological quality 
are needed to fill in the unknown measurement 
properties. 

For all other languages we advise translating the
original version of the NDI.